Absolute temperature, the fundamental temperature scale in
thermodynamics, is usually bound to be positive. Under special
conditions, however, negative temperatures - where high-energy
states are more occupied than low-energy states - are also
possible. So far, such states have been demonstrated in localized
systems with finite, discrete spectra. Here, we were able to
prepare a negative temperature state for motional degrees of
freedom. By tailoring the Bose-Hubbard Hamiltonian we created
an attractively interacting ensemble of ultracold bosons
at negative temperature that is stable against collapse for
arbitrary atom numbers. The quasi-momentum distribution develops
sharp peaks at the upper band edge, revealing thermal equilibrium
and bosonic coherence over several lattice sites. Negative
temperatures imply negative pressures and open up new
parameter regimes for cold atoms, enabling fundamentally new
many-body states and counterintuitive effects such as Carnot
engines above unity efficiency.

Absolute temperature T is one of the central concepts of 
statistical mechanics and is a measure of e.g. the amount of 
disordered motion in a classical ideal gas. Therefore, nothing 
can be colder than T = 0, where classical particles would be 
at rest. In a thermal state of such an ideal gas, the probability 
$P_i$ for a particle to occupy a state $i$ with kinetic energy $E_i$ is
proportional to the Boltzmann factor, 



$$P_i \propto e^{-E_i / k_B T}, \tag{1}$$

where $k_B$ is Boltzmann's constant. An ensemble at positive
temperature is described by an occupation distribution that 
decreases exponentially with energy. If we were to extend this 
formula to negative absolute temperatures, exponentially in- 
creasing distributions would result. Because the distribution 
needs to be normalizable, at positive temperatures a lower 
bound in energy is required, as the probabilities $P_i$ would
diverge for $E_i \to -\infty$. Negative temperatures, on the other
hand, demand an upper bound in energy [1, 2]. In daily life 
negative temperatures are absent, as kinetic energy in most 
systems, including particles in free space, only provides a 
lower energy bound. Even in lattice systems, where kinetic 
energy is split into distinct bands, implementing an upper 
energy bound for motional degrees of freedom is challenging,
as potential and interaction energy need to be limited as
well [3, 4]. So far, negative temperatures have been realized
in localized spin systems [5, 6, 7], where the finite, discrete
spectrum naturally provides both lower and upper energy
bounds. Here, we were able to realize a negative temperature
state for motional degrees of freedom.

Fig. 1. Negative temperature in optical lattices. (A) Sketch of
entropy as a function of energy in a canonical ensemble possessing both
lower ($E_{\min}$) and upper ($E_{\max}$) energy bounds. Insets: sample
occupation distributions of single-particle states for positive, infinite, and
negative temperature, assuming a weakly interacting ensemble. (B)
Energy bounds of the three terms of the two-dimensional (2D)
Bose-Hubbard Hamiltonian: kinetic ($E_{\mathrm{kin}}$), interaction ($E_{\mathrm{int}}$), and potential
($E_{\mathrm{pot}}$) energy. (C) Measured momentum distributions (TOF images)
for positive (left) and negative (right) temperature states. Both
images are averages of about 20 shots; both optical densities (OD) are
individually scaled. The contour plots below show the tight-binding
dispersion relation; momenta with large occupation are highlighted.
The white square in the center indicates the first Brillouin zone.
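
The role of the energy bounds in the Boltzmann factor can be illustrated with a minimal sketch. The five-level spectrum below is a hypothetical example in units of J, not data from the experiment, and the function name is an illustrative choice. For a finite spectrum the weights can be normalized for either sign of the temperature, and a negative temperature simply inverts the occupations.

```python
import math

def boltzmann_occupations(energies, kT):
    """Normalized occupations P_i proportional to exp(-E_i / kT).

    kT may be negative, provided the list of energies is finite,
    i.e. the spectrum is bounded from above as well as from below.
    """
    if kT == 0:
        raise ValueError("kT must be non-zero")
    # Reference energy: the level with the largest weight, so that
    # every exponent is <= 0 and exp() cannot overflow.
    ref = min(energies) if kT > 0 else max(energies)
    weights = [math.exp(-(e - ref) / kT) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical five-level spectrum spanning a band of width 8 J (units of J).
levels = [-4.0, -2.0, 0.0, 2.0, 4.0]
p_pos = boltzmann_occupations(levels, kT=2.0)   # decreases with energy
p_neg = boltzmann_occupations(levels, kT=-2.0)  # increases with energy
```

For this symmetric spectrum the negative temperature distribution is the mirror image of the positive one. With an unbounded spectrum the sum over weights would diverge for one sign of kT, which is the normalizability argument made above.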

Fig. 2. Experimental sequence and TOF images. (A) Top to bottom: lattice depth, horizontal trap frequency, and scattering length as a
function of time. Blue indicates the sequence for positive, red for negative temperature of the final state. (B) TOF images of the atomic
cloud at various times $t$ in the sequence. Blue borders indicate positive, red negative temperatures. The initial picture in a shallow lattice at
$t = 6.8$ ms is taken once for a scattering length of $a = 309(5)\,a_0$ (top) as in the sequence, and once for $a = 33(1)\,a_0$ (bottom, OD rescaled
by a factor of 0.25), comparable to the final images. All images are averages of about 20 individual shots. See also Fig. 1C.

In Fig. 1A we schematically show the relation between entropy
$S$ and energy $E$ for a thermal system possessing both
lower and upper energy bounds. Starting at minimum en- 
ergy, where only the ground state is populated, an increase 
in energy leads to an occupation of a larger number of states 
and therefore an increase in entropy. As the temperature ap- 
proaches infinity, all states become equally populated and the 
entropy reaches its maximum possible value $S_{\max}$. However,
the energy can be increased even further if high-energy states 
are more populated than low-energy ones. In this regime the 
entropy decreases with energy, which, according to the ther- 
modynamic definition of temperature [8] (1/T = dS/dE), 
results in negative temperatures. The temperature is dis- 
continuous at maximum entropy, jumping from positive to 
negative infinity. This is a consequence of the historic definition
of temperature. A continuous and monotonically increasing
temperature scale would be given by $-\beta = -1/k_B T$,
also emphasizing that negative temperature states are hotter 
than positive temperature states, i.e. in thermal contact heat 
would flow from a negative to a positive temperature system. 
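
The sign change of the temperature at maximum entropy can be reproduced with a toy model. The sketch below assumes a collection of independent two-level units, which is a stand-in for any spectrum with both bounds and not a model of the lattice gas; the function names and the choice of 100 units are illustrative. It evaluates $S(E)$ from the multiplicity and estimates $1/k_B T$ from the slope $dS/dE$.

```python
import math

def entropy(n_excited, n_total):
    """S / k_B = ln(binomial(n_total, n_excited)) for two-level units."""
    return (math.lgamma(n_total + 1)
            - math.lgamma(n_excited + 1)
            - math.lgamma(n_total - n_excited + 1))

def inverse_temperature(n_excited, n_total, level_spacing=1.0):
    """Central-difference estimate of 1 / (k_B T) = (1 / k_B) dS/dE.

    The energy is E = n_excited * level_spacing, so the result is in
    units of 1 / level_spacing.
    """
    d_entropy = entropy(n_excited + 1, n_total) - entropy(n_excited - 1, n_total)
    return d_entropy / (2.0 * level_spacing)

n_total = 100
for n_excited in (25, 50, 75):
    # Below half filling the slope is positive, at half filling it
    # vanishes (infinite temperature), above it the slope is negative.
    print(n_excited, inverse_temperature(n_excited, n_total))
```

In this picture $-\beta$ runs continuously through zero at half filling, while $T$ itself jumps from positive to negative infinity.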

As negative temperature systems can absorb entropy while 
releasing energy, they give rise to several counterintuitive ef- 
fects such as Carnot engines with an efficiency greater than 
unity [4] . Via a stability analysis for thermodynamic equilib- 
rium we showed that negative temperature states of motional 
degrees of freedom necessarily possess negative pressure [9] 
and are thus of fundamental interest to the description of 
dark energy in cosmology, where negative pressure is required 
to account for the accelerating expansion of the universe [10]. 
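
The efficiency statement can be made concrete with a small sketch. It assumes that the textbook Carnot expression $\eta = 1 - T_{\mathrm{cold}}/T_{\mathrm{hot}}$ carries over unchanged when the hotter reservoir is at negative temperature; the numerical temperatures are arbitrary illustrative values.

```python
def carnot_efficiency(t_hot, t_cold):
    """Textbook Carnot efficiency 1 - T_cold / T_hot.

    Both temperatures must be given in the same units. A negative
    t_hot represents a reservoir hotter than any positive temperature.
    """
    if t_hot == 0:
        raise ValueError("t_hot must be non-zero")
    return 1.0 - t_cold / t_hot

eta_ordinary = carnot_efficiency(t_hot=400.0, t_cold=300.0)  # both positive
eta_negative = carnot_efficiency(t_hot=-2.0, t_cold=1.0)     # hot bath at T < 0
```

With a negative temperature hot reservoir the ratio $T_{\mathrm{cold}}/T_{\mathrm{hot}}$ is negative, so the expression exceeds unity: the engine absorbs heat from both reservoirs and converts it into work.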

Cold atoms in optical lattices are an ideal system to 
create negative temperature states because of the isolation 
from the environment and independent control of all rele- 
vant parameters [11]. Bosonic atoms in the lowest band of 
a sufficiently deep optical lattice are described by the Bose- 
Hubbard Hamiltonian [12] 

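$$H = -J \sum_{\langle i,j \rangle} \hat{b}_i^\dagger \hat{b}_j + \frac{U}{2} \sum_i \hat{n}_i (\hat{n}_i - 1) + V \sum_i \mathbf{r}_i^2 \hat{n}_i. \tag{2}$$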



Here, $J$ is the tunneling matrix element between neighboring
lattice sites and $\hat{b}_i$ and $\hat{b}_i^\dagger$ are the annihilation and
creation operator, respectively, for a boson on site $i$, $U$ is
the on-site interaction energy, $\hat{n}_i = \hat{b}_i^\dagger \hat{b}_i$ the local number
operator, and $V \propto \omega^2$ describes the external harmonic confinement
with $\mathbf{r}_i$ denoting the position of site $i$ with respect
to the trap center and $\omega$ the trap frequency.

In Fig. 1B we show how lower and upper bounds can be
realized for the three terms in the Hubbard Hamiltonian.
The restriction to a single band naturally provides upper and
lower bounds for the kinetic energy $E_{\mathrm{kin}}$, but the interaction
term $E_{\mathrm{int}}$ presents a challenge: because in principle all bosons
could occupy the same lattice site, the interaction energy can 
diverge in the thermodynamic limit. For repulsive interac- 
tions (U > 0), the interaction energy is only bounded from 
below but not from above, thereby limiting the system to 
positive temperatures; in contrast, for attractive interactions 
(U < 0) only an upper bound for the interaction energy is 
established, rendering positive temperature ensembles unsta- 
ble. The situation is different for the Fermi Hubbard model, 
where the Pauli principle enforces an upper limit on the interaction
energy per atom of $U/2$ and thereby allows negative
temperatures even in the repulsive case [13, 14]. Similarly,
a trapping potential $V > 0$ only provides a lower bound for
the potential energy $E_{\mathrm{pot}}$, while an anti-trapping potential
$V < 0$ creates an upper bound. Therefore, stable negative
temperature states with bosons can exist only for attractive 
interactions and an anti-trapping potential. 
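
The bounds on the interaction term can be checked by brute force for small systems. The sketch below enumerates every way of distributing a few bosons over a few lattice sites and evaluates $\frac{U}{2} \sum_i n_i (n_i - 1)$; the system sizes and function names are illustrative assumptions chosen to keep the enumeration small.

```python
from collections import Counter
from itertools import combinations_with_replacement

def interaction_energy(occupations, u):
    """On-site interaction energy (U / 2) * sum_i n_i (n_i - 1)."""
    return 0.5 * u * sum(n * (n - 1) for n in occupations)

def interaction_bounds(n_atoms, n_sites, u):
    """Minimum and maximum interaction energy over all Fock states."""
    energies = []
    # Each multiset of site indices is one occupation-number configuration.
    for assignment in combinations_with_replacement(range(n_sites), n_atoms):
        occupations = Counter(assignment).values()
        energies.append(interaction_energy(occupations, u))
    return min(energies), max(energies)

# Repulsive case: the maximum, reached with all atoms on one site,
# grows as N (N - 1) / 2 and is therefore unbounded per atom.
for n_atoms in (2, 4, 6):
    low, high = interaction_bounds(n_atoms, n_sites=6, u=1.0)
    print(n_atoms, low, high / n_atoms)

# Attractive case: the roles are exchanged and the maximum stays at zero.
print(interaction_bounds(4, n_sites=4, u=-1.0))
```

For $U > 0$ the energy per atom at the upper edge grows linearly with atom number, whereas for $U < 0$ the upper edge stays fixed and the lower edge runs away, which is the asymmetry described above.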

In order to bridge the transition between positive and negative
temperatures we used the $n = 1$ Mott insulator [15]
close to the atomic limit ($|U|/J \to \infty$), which can be approximated
by a product of Fock states $|\Psi\rangle = \prod_i \hat{b}_i^\dagger |0\rangle$. As
this state is a many-body eigenstate in both the repulsive 
and the attractive case, it allows us to switch between these 
regimes, ideally without producing entropy. The employed 
sequence (Fig. 2A) is based on a proposal by Rapp et al. [4], 
building on previous ideas by Mosk [3]. It essentially consists
of loading a repulsively interacting Bose-Einstein condensate
(BEC) into the deep Mott insulating regime (I in Fig. 2A),
switching $U$ and $V$ to negative values (II), and finally melting
the Mott insulator again by reducing $|U|/J$ (III). For comparison,
we also created a final positive temperature state
with an analogous sequence.

The experiment started with a BEC of $1.1(2) \times 10^5$ $^{39}$K
atoms in a pure dipole trap with horizontal trap frequency
$\omega_{\mathrm{dip}}$ ($V > 0$) at positive temperature ($T > 0$) and a scattering
length of $a = 309(5)\,a_0$, with $a_0$ the Bohr radius. We
ramped up a three-dimensional optical lattice (I) with simple
cubic symmetry to a depth of $V_{\mathrm{lat}} = 22(1)\,E_r$. Here
$E_r = h^2/(2 m \lambda_{\mathrm{lat}}^2)$ is the recoil energy with Planck's constant
$h$, the atomic mass $m$, and the lattice wavelength
$\lambda_{\mathrm{lat}} = 736.65$ nm. The blue-detuned optical lattice provides
an overall anti-trapping potential with a formally imaginary
horizontal trap frequency $\omega_{\mathrm{lat}}$ that reduces the confinement
of the dipole trap, giving an effective horizontal trap frequency
$\omega_{\mathrm{hor}} = \sqrt{\omega_{\mathrm{dip}}^2 + \omega_{\mathrm{lat}}^2}$. Once the atoms are in the
deep Mott insulating regime where tunneling can essentially
be neglected (tunneling time $\tau = h/(2\pi J) = 10(2)$ ms), we
set the desired attractive (repulsive) interactions (II) to prepare
a final negative (positive) temperature state using a
Feshbach resonance [16]. Simultaneously, we decreased the
horizontal confinement to an overall anti-trapping (trapping)
potential by decreasing $\omega_{\mathrm{dip}}$. Subsequently, we decreased
the horizontal lattice depths (III), yielding a final value of
$U/J = -2.1(1)$ ($+1.9(1)$), and probed the resulting momentum
distribution by absorption imaging after 7 ms time-of-flight
(TOF). The whole sequence was experimentally optimized
to maximize the visibility of the final negative temperature
state. We chose a 2D geometry for the final state in
order to enable strong anti-trapping potentials and to avoid
detrimental effects due to gravity [9].
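
The energy and frequency scales quoted in this paragraph can be put into numbers with a short sketch. It uses the recoil energy $h^2/(2 m \lambda^2)$, the tunneling time $h/(2\pi J)$ and the quadrature sum of trap frequencies with an imaginary lattice contribution. The tabulated mass of $^{39}$K and the two example trap frequencies are assumptions introduced here; the text does not quote the individual dipole and lattice frequencies.

```python
import math

PLANCK_H = 6.62607015e-34             # Planck constant in J s
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg
MASS_K39 = 38.96370649 * ATOMIC_MASS_UNIT  # assumed tabulated mass of 39K

def recoil_frequency(wavelength, mass):
    """E_r / h in Hz for E_r = h^2 / (2 m lambda^2)."""
    return PLANCK_H / (2.0 * mass * wavelength ** 2)

def tunneling_frequency(tunneling_time):
    """J / h in Hz from the tunneling time tau = h / (2 pi J)."""
    return 1.0 / (2.0 * math.pi * tunneling_time)

def effective_trap_frequency(f_dip, f_lat_abs):
    """Combine a trapping dipole frequency with an anti-trapping lattice
    contribution of magnitude f_lat_abs. The lattice frequency is
    imaginary, so its square enters with a minus sign.

    Returns (|f_hor|, is_trapping).
    """
    f_squared = f_dip ** 2 - f_lat_abs ** 2
    return math.sqrt(abs(f_squared)), f_squared > 0

recoil_hz = recoil_frequency(736.65e-9, MASS_K39)   # lattice wavelength from the text
j_deep_hz = tunneling_frequency(10e-3)              # tau = 10 ms in the deep lattice
# Hypothetical frequencies in Hz, chosen only to show the sign logic.
f_hor, is_trapping = effective_trap_frequency(f_dip=30.0, f_lat_abs=50.0)
```

A result with `is_trapping == False` corresponds to an imaginary $\omega_{\mathrm{hor}}$, i.e. an overall anti-trapping potential of magnitude `f_hor`.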

In Fig. 2B we show TOF images of the cloud for various 
times $t$ in the sequence, indicated in Fig. 2A. During the initial
lattice ramp (at $V_{\mathrm{lat}} = 6.1(1)\,E_r$), interference peaks of
the superfluid in the lattice can be observed (t = 6.8 ms top). 
Because quantum depletion caused by the strong repulsive 
interactions already reduces the visibility of the interference 
peaks in this image [17], we also show the initial superfluid for 
identical lattice and dipole ramps, but at a scattering length 
of $a = 33(1)\,a_0$ ($t = 6.8$ ms bottom). The interference peaks
are lost as the Mott insulating regime is entered ($t = 25$ ms).
In the deep lattice only weak nearest-neighbor correlations 
are expected, resulting in similar images for both repulsive 
and attractive interactions (t = 28 ms). After reducing the 
horizontal lattice depths back into the superfluid regime, the 
coherence of the atomic sample emerges again. For positive 
temperatures the final image at t = 30.5 ms is comparable, 
albeit somewhat heated, to the initial one at $t = 6.8$ ms,
whereas for attractive interactions sharp peaks show up in 
the corners of the first Brillouin zone, indicating macroscopic 
occupation of maximum kinetic energy. The spontaneous de- 
velopment of these sharp interference peaks is a striking sig- 
nature of a stable negative temperature state for motional 
degrees of freedom. In principle, the system can enter the 
negative temperature regime following one of two routes: it 
either stays close to thermal equilibrium during the entire
sequence or alternatively relaxes towards a thermal distribution
during lattice ramp-down. Either way demonstrates the
thermodynamic stability of this negative temperature state.

Fig. 3. Occupation distributions. The occupation of the kinetic
energies within the first Brillouin zone is plotted for the final positive
(blue) and negative (red) temperature states. Points: experimen- 
tal data extracted from band-mapped pictures. Solid lines: fits to a 
non-interacting Bose-Einstein distribution assuming a homogeneous 
system. Insets: top row: symmetrized positive (left) and negative 
(right) temperature images of the quasi-momentum distribution in 
the horizontal plane. Bottom row: fitted distributions for the two 
cases. Note that all distributions are broadened by the in situ cloud 
size [9]. 

In order to examine the degree of thermalization in the 
final states, we used band-mapped [18] images and extracted 
the kinetic energy distribution assuming a non-interacting 
lattice dispersion relation $E_{\mathrm{kin}}(q_x, q_y)$. The result is shown
in Fig. 3, displaying very good agreement with a fitted
Bose-Einstein distribution. The fitted temperatures of
$T = -2.2\,J/k_B$ and $T = 2.7\,J/k_B$ for the two cases only represent
upper bounds for the absolute values $|T|$ of the average temperature
because the fits neglect the inhomogeneous filling of
the sample [9]. Both temperatures are slightly larger than the
critical temperature $|T_{\mathrm{BKT}}| \approx 1.8\,J/k_B$ [19] for the superfluid
transition in an infinite 2D system but lie below the condensation
temperature $|T_C| = 3.4(2)\,J/k_B$ of non-interacting
bosons in a 2D harmonic trap for the given average density 
[9]. 

Ideally, entropy is produced during the sequence only in 
the superfluid/normal shell around the interim Mott insulator:
while ramping to the deep lattice, the atoms in this shell
localize to individual lattice sites and can subsequently be
described as a $|T| = \infty$ system [14]. Numerical calculations
have shown that the total entropy produced in this process
can be small [4], as most of the atoms are located in the Mott
insulating core. We attribute the observed additional heat- 
ing during the sequence to non-adiabaticities during lattice 
ramp-down and residual double occupancies in the interim 
Mott insulator. 

While the coherence length of the atomic sample can in 
principle be extracted from the interference pattern recorded 
after a long TOF [20], the experiment was limited to finite 
TOF, where the momentum distribution is convolved with 
the initial spatial distribution. By comparing the measured 
TOF images with theoretically expected distributions, we 
were able to extract a coherence length in the final negative 
temperature state of 3 to 5 lattice constants [9]. 

In order to demonstrate the stability of the observed negative
temperature state, Fig. 4 shows the visibility of the interference
pattern as a function of hold time in the final lattice.
The resulting lifetime of the coherence in the final negative
temperature state crucially depends on the horizontal trap
frequencies (inset): lifetimes exceed $\tau = 600$ ms for an optimally
chosen anti-trapping potential, but an increasingly
fast loss of coherence is visible for less anti-trapping geometries.
In the case of trapping potentials, the ensemble can
even return to metastable positive temperatures, giving rise
to the small negative visibilities observed after longer hold
times (Fig. S4). The loss of coherence probably originates
from a mismatch between the attractive mean-field and the
external potential, which acts as an effective potential and
leads to fast dephasing between lattice sites.

The high stability of the negative temperature state for
the optimally chosen anti-trapping potential indicates that
the final chemical potential is matched throughout the sample
such that no global redistribution of atoms is necessary.
The remaining slow decay of coherence is not specific to the
negative temperature state as we also observe comparable
heating for the corresponding positive temperature case (blue
data in Fig. 4) as well as the initial superfluid in the lattice. It
probably originates from three-body losses and light-assisted
collisions.

Fig. 4. Stability of the positive (blue) and negative (red) temperature
states. Main figure: visibility $\mathcal{V} = (n_b - n_r)/(n_b + n_r)$ extracted from
the atom numbers in the black ($n_b$) and red ($n_r$) boxes (indicated in
the TOF images) plotted versus hold time in the final state for various
horizontal trap frequencies. Dark red: $|\omega_{\mathrm{hor}}|/2\pi = 43(1)$ Hz
anti-trapping, medium red: 22(3) Hz anti-trapping, light red: 42(3) Hz
trapping, blue: 45(3) Hz trapping. Inset: coherence lifetimes $\tau$
extracted from exponential fits (solid lines in main figure). The
statistical error bars from the fits are smaller than the data points. The
color scale of the images is identical to Fig. 2B (see also Fig. S3).

To summarize, we have created a negative temperature
state for motional degrees of freedom. It exhibits coherence
over several lattice sites, with coherence lifetimes exceeding
600 ms, and its quasi-momentum distribution can be reproduced
by Bose-Einstein statistics at negative temperature. In
contrast to metastable excited states [21], this isolated negative
temperature ensemble is intrinsically stable and cannot
decay into states at lower kinetic energies. It represents a stable
bosonic ensemble at attractive interactions for arbitrary
atom numbers; the negative temperature stabilizes the system
against mean-field collapse that is driven by the negative
pressure.

Negative temperature states can be exploited to investigate
the Mott insulator transition [22] as well as the renormalization
of Hubbard parameters [23, 24] for attractive interactions.
As the stability of the attractive gas relies on
the bounded kinetic energy in the Hubbard model, it naturally
allows a controlled study of the transition from stable
to unstable by lowering the lattice depth, thereby connecting
this regime with the study of collapsing BECs [25], which is
also of interest for cosmology [26]. Negative temperatures
also significantly enhance the parameter space accessible for
quantum simulations in optical lattices, as they enable the
study of new many-body systems whenever the bands are
not symmetric with respect to the inversion of kinetic energy.
This is the case e.g. in triangular or Kagome lattices, where in
current implementations [27] the interesting flat band is the
highest of three sub-bands. In fermionic systems, negative
temperatures enable e.g. the study of the attractive SU(3)
model describing color superfluidity and trion (baryon) formation
using repulsive $^{173}$Yb [28], where low losses and symmetric
interactions are expected but magnetic Feshbach resonances
are absent.
Supplementary materials



Negative pressure 

In order to investigate the conditions for thermal equilibrium,
we consider a gas in a box of volume $V_{\mathrm{box}}$ with fixed
total energy $E$. In thermal equilibrium the entropy $S$ takes
the maximal possible value under the given constraints. We
note that the size of the box is only an upper limit for the
volume $V$ of the gas, $V \leq V_{\mathrm{box}}$. In most cases this leads to
the condition of a positive pressure $P > 0$, forcing the gas to
fill the whole box, $V = V_{\mathrm{box}}$ [29]. More generally, however,
the maximum entropy principle only requires 



$$\left. \frac{\partial S}{\partial V} \right|_E \geq 0. \tag{S1}$$



If this derivative was negative, the system could sponta- 
neously contract and thereby increase its entropy. As a 
consequence, there would be no equilibrium solution with 
$V = V_{\mathrm{box}}$ but instead the system would be unstable against
collapse. 

From the total differential of energy $dE = T\,dS - P\,dV$
one obtains

$$dS = \frac{1}{T}\,dE + \frac{P}{T}\,dV, \tag{S2}$$

which leads to

$$\left. \frac{\partial S}{\partial V} \right|_E = \frac{P}{T}. \tag{S3}$$



Thus absolute pressure and temperature necessarily have the
same sign in equilibrium, $P/T \geq 0$, i.e. negative temperatures
imply negative pressures and vice versa.

Since attractive interactions naturally lead to negative 
pressures, this illustrates why an attractively interacting 
BEC at positive temperatures is inherently unstable against 
collapse [30]. This is in contrast to the case studied here, 
where the negative temperature stabilizes the negative pres- 
sure system against collapse. 



Experimental sequence 

The sequence employed in this work is illustrated in Fig.
2A in the main text. We prepared a condensate of
$N = 105(14) \times 10^3$ $^{39}$K atoms in a dipole trap with trapping
frequencies of $\omega_{\mathrm{hor}} = 2\pi \times 37(1)$ Hz and $\omega_{\mathrm{vert}} = 2\pi \times 181(12)$ Hz
along the horizontal and vertical directions, respectively. The
horizontal trap frequency indicates the root mean square
of the trap frequencies along the two horizontal directions.
There is no detectable thermal fraction in TOF images of
the condensed cloud. We created a Mott insulator by linearly
ramping up a three-dimensional optical lattice (I in
Fig. 2A) at a wavelength of $\lambda_{\mathrm{lat}} = 736.65$ nm within 25 ms
to a lattice depth of $V_{\mathrm{lat}} \approx 22(1)\,E_r$. The blue-detuned optical
lattice provides an overall anti-trapping potential with
a formally imaginary horizontal trap frequency $\omega_{\mathrm{lat}}$ that reduces
the confinement of the dipole trap, giving an effective
horizontal trap frequency $\omega_{\mathrm{hor}} = \sqrt{\omega_{\mathrm{dip}}^2 + \omega_{\mathrm{lat}}^2}$. The scattering
length during the lattice ramp was set to $a = 309(5)\,a_0$,
resulting in a final interaction in the Mott insulating state of
$U/J > 800$. This loading sequence is designed to minimize
doubly occupied sites in the Mott insulating state, as double
occupancies promote atoms into higher bands when crossing
the Feshbach resonance [31]. During the lattice ramp, we
increased the trapping frequencies to $\omega_{\mathrm{hor}} = 2\pi \times 97(4)$ Hz
and $\omega_{\mathrm{vert}} = 2\pi \times 215(13)$ Hz by changing $\omega_{\mathrm{dip}}$, in order to
increase the fraction of atoms in the Mott insulating core.

In the deep lattice, where tunneling can essentially be neglected
(tunneling time $\tau = h/(2\pi J) = 10(2)$ ms), we ramped
the scattering length within 2 ms to its final value (II), either
repulsive at $a = 33(1)\,a_0$ or attractive at $a = -37(1)\,a_0$
($|U/J| > 80$) by crossing a Feshbach resonance at a magnetic
field of 402.50(3) G [16]. At the same time, we decreased the
horizontal confinement to $\omega_{\mathrm{hor}} = 2\pi \times 38(4)$ Hz in the repulsive
case or to a maximally anti-trapping potential with a formally
imaginary trapping frequency of $|\omega_{\mathrm{hor}}| = 2\pi \times 49(1)$ Hz
in the attractive case by decreasing the power of the red-detuned
dipole trap. The anti-confining potential is provided
by the blue-detuned lattice beams. Subsequently, we
linearly decreased the horizontal lattice depths (III) within
2.5 ms to $V_{\mathrm{hor}} = 6.1(1)\,E_r$, but kept the vertical lattice at
$V_{\mathrm{vert}} = 22(1)\,E_r$ to avoid effects due to gravity and to enable
strong anti-trapping potentials (see below). During this lattice
ramp, the horizontal trapping frequencies change only
slightly. After instantaneously switching off all optical potentials,
we ramped down the homogeneous magnetic field
within 2 ms and recorded absorption images along the vertical
direction after a total TOF of 7 ms. The chosen TOF
is a compromise between a sufficient transformation of momentum
space into real space during TOF and a sufficiently
large final optical density.



Inverting the external potential 

In principle, negative temperature states can be created in 
three dimensions by additionally inverting the vertical con- 
finement to an anti-trapping potential. In reality, however, 
inverting the vertical harmonic confinement without invert- 
ing gravity would also invert the gravitational sag, thereby 
creating a strong vertical gradient at the unchanged position 
of the atoms. While this issue could be mitigated by ap- 
plying a suitable vertical magnetic field gradient, we instead 
chose to keep the vertical lattice strong. This has the ad- 
ditional advantage that more intense vertical lattice beams 
provide a stronger anti-trapping potential, enabling a total 
anti-trapping frequency of up to $|\omega_{\mathrm{hor}}| = 2\pi \times 43(1)$ Hz.

For an exact inversion of the interaction and potential en- 
ergy terms in the Hamiltonian, the horizontal confinement 
should have been precisely inverted to an anti-trapping potential
of frequency $|\omega_{\mathrm{hor,f}}| = \omega_{\mathrm{hor,i}} = 2\pi \times 97$ Hz (f: final, i:
initial) simultaneously with a Feshbach ramp to a negative
scattering length of $a_f = -a_i = -309\,a_0$ and the sequence
thereafter should have mirrored the loading sequence. In 
the experiment, however, completely inverting the potential 
would only be possible with an additional blue-detuned anti- 
trapping beam. To compensate the resulting mismatch in 
potential energy, we also decreased the interaction strength 
simultaneously with the harmonic confinement such that the ratio
$|U|/|\omega_{\mathrm{hor}}|^2$ remains approximately constant. If we assume
that the density distribution before entering the Mott insu- 
lator is in global thermal equilibrium and density is hardly 
redistributed in the deep lattice, the sum of potential and 
mean-field interaction energy should again be approximately 
constant throughout the cloud after ramping back into the 
superfluid regime. This minimizes the effective potential for 
the atoms and the associated dephasing in the final state. We 
note that ramping down the lattice depth within 25 ms, as in 
the initial loading sequence, would result in a comparable fi- 
nal visibility (4% lower than for 2.5 ms) for an anti-trapping 
potential with $|\omega_{\mathrm{hor}}| = 2\pi \times 43(1)$ Hz. The shown lifetime
measurements over a large range of $|\omega_{\mathrm{hor}}|$, however, are only
possible with a fast ramp, where dephasing during the ramp 
due to strong effective potentials is less important. 

Quasi-momentum distribution 

In order to measure the quasi-momentum distribution of 
the final states we used a band-mapping technique [18], i.e. 
we linearly ramped down the final lattice in 60 $\mu$s followed
by 7 ms TOF and averaged about 20 images for both the fi- 
nal negative and positive temperature states. We note that 
these images do not directly represent the quasi-momentum 
distribution of the ensembles, but are convolved with the in 
situ density distribution. As the lattice axes along the hor- 
izontal directions are not perfectly orthogonal, we rectified 
the slight asymmetry in the position of the coherence peaks: 
We fitted the four main peak positions of a negative temper- 
ature ensemble and, by applying a shearing transformation, 
mapped them onto the corners of a square. For a reliable fit 
of 2D Bose-Einstein distributions (see below) we also equal- 
ized the varying heights of the four peaks by multiplication 
with a linearly interpolated normalization map. For the pos- 
itive temperature state, we extracted the necessary shearing 
transformation from the positions of the first order coherence 
peaks in an image without band-mapping. We did not rescale 
the height of the single coherence peak in the band-mapped 
image. The resulting images are shown as the upper insets 
in Fig. 3 in the main text. 

After symmetrizing the data, we assigned horizontal quasi-momenta
$q_x$ and $q_y$ to the individual pixels of the images, and
extracted the quasi-momentum distribution $n(q_x, q_y)$ in the
first Brillouin zone, which is still convolved with the initial 
density. From this, we obtained the kinetic energy distribu- 
tion, i.e. the number of atoms with a given kinetic energy, 
by use of the tight-binding dispersion relation at the given 
lattice depth, 

$$E_{\mathrm{kin}}(q_x, q_y) = -2J \left[ \cos(q_x d) + \cos(q_y d) \right], \tag{S4}$$

where $d = \lambda_{\mathrm{lat}}/2$ is the lattice constant. This distribution
was then normalized by the density of states in the 2D optical
lattice in order to obtain the occupation $\rho_{\mathrm{exp}}(E_{\mathrm{kin}})$ per Bloch
wave as a function of energy. The result is shown as the blue 
and red data points in Fig. 3 in the main text for the final 
positive and negative temperature states, respectively. 
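
The normalization by the density of states can be sketched as follows. The code evaluates the 2D tight-binding dispersion on a regular grid of quasi-momenta in the first Brillouin zone, bins both the atom number and the number of Bloch states by kinetic energy, and divides the two. Grid size, bin count and the test distributions are illustrative assumptions, and the convolution with the in situ density mentioned above is ignored.

```python
import math

def kinetic_energy(qx, qy, j=1.0, d=1.0):
    """2D tight-binding dispersion -2 J [cos(qx d) + cos(qy d)]."""
    return -2.0 * j * (math.cos(qx * d) + math.cos(qy * d))

def occupation_per_state(n_of_q, n_grid=64, n_bins=16, j=1.0, d=1.0):
    """Bin a quasi-momentum distribution n(qx, qy) by kinetic energy and
    divide by the number of Bloch states falling into each bin."""
    atoms = [0.0] * n_bins
    states = [0] * n_bins
    step = 2.0 * math.pi / (n_grid * d)
    for ix in range(n_grid):
        for iy in range(n_grid):
            # Midpoint grid covering the first Brillouin zone.
            qx = -math.pi / d + (ix + 0.5) * step
            qy = -math.pi / d + (iy + 0.5) * step
            energy = kinetic_energy(qx, qy, j, d)
            # The band spans [-4 J, 4 J]; clamp the index at the edges.
            index = int((energy + 4.0 * j) / (8.0 * j) * n_bins)
            index = min(max(index, 0), n_bins - 1)
            atoms[index] += n_of_q(qx, qy)
            states[index] += 1
    return [a / s if s else float("nan") for a, s in zip(atoms, states)]

# A flat distribution gives one atom per Bloch state in every bin.
flat = occupation_per_state(lambda qx, qy: 1.0)
# A distribution weighted towards the band top, as at negative temperature.
inverted = occupation_per_state(lambda qx, qy: math.exp(kinetic_energy(qx, qy) / 2.0))
```

Dividing by the state count removes the strongly energy-dependent density of states of the square lattice, so that the result can be compared directly with a thermal occupation function.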



Bose-Einstein fits 

In order to extract the temperature of the final states, we 
fitted the quasi-momentum distributions, as detailed above, 
by a Bose-Einstein distribution function for the kinetic en- 
ergy, 

$$n(q_x, q_y) = \frac{1}{e^{(E_{\mathrm{kin}}(q_x, q_y) - \mu)/k_B T} - 1} + o, \tag{S5}$$

which was convolved with a Gaussian (see below). Independent
fitting parameters include the chemical potential $\mu$, the
temperature $T$ and a constant offset $o$. The results of these
fits are shown as the lower insets in Fig. 3 in the main text. 
From the fits, the occupation $\rho_{\mathrm{fit}}(E_{\mathrm{kin}})$ of the Bloch waves
as a function of energy was extracted analogously to the ex- 
perimental data (see above), and is shown as solid curves in 
Fig. 3. 
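
A minimal sketch of the fit function, without the Gaussian convolution, shows how the same Bose-Einstein expression covers both signs of the temperature. The chemical potentials used below are hypothetical values chosen for illustration; only the temperatures are taken from the fits reported in the main text. For negative temperature the chemical potential has to lie above the upper band edge so that all occupations stay positive.

```python
import math

def bose_einstein(e_kin, mu, kT, offset=0.0):
    """Occupation 1 / (exp((E - mu) / kT) - 1) + offset.

    Requires (E - mu) / kT > 0: mu below the band for kT > 0,
    mu above the band for kT < 0.
    """
    x = (e_kin - mu) / kT
    if x <= 0:
        raise ValueError("chemical potential is on the wrong side of the band")
    return 1.0 / math.expm1(x) + offset

band = [-4.0, -2.0, 0.0, 2.0, 4.0]  # kinetic energies in units of J
# kT in units of J; mu = +/- 4.5 J is a hypothetical choice.
n_neg = [bose_einstein(e, mu=4.5, kT=-2.2) for e in band]
n_pos = [bose_einstein(e, mu=-4.5, kT=2.7) for e in band]
```

At negative temperature the occupation rises towards the upper band edge at $E_{\mathrm{kin}} = 4J$, mirroring the usual accumulation at the band minimum for positive temperature.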

The very good reproduction of the data with a Bose- 
Einstein distribution function, together with the appearance 
of stable coherence peaks, indicates that the final negative 
and positive temperature states are thermalized. This fitting 
procedure, however, neglects interaction effects and potential 
energy and overestimates the filling by assuming a homoge- 
neous density of one atom per site (see below), while the 
experiment uses an inhomogeneous system with unity fill- 
ing only in the center. Therefore, the fitted temperatures of 
$2.7\,J/k_B$ and $-2.2\,J/k_B$ are systematically too large in
absolute value and represent only upper bounds.

During the final lattice ramp of 2.5 ms from the Mott in- 
sulating to the superfluid regime, not more than 2 integrated 
tunneling events can take place, limiting the overall mass 
transport during ramp-down. To a good approximation, we 
can therefore estimate the final average filling $\bar{n}$ in the superfluid
regime by $\bar{n}$ in the Mott insulating state. The average
filling in the atomic limit depends on the entropy of the cloud,
and can range for our parameters from $\bar{n} = 1$ in the zero
entropy case to $\bar{n} \approx 0.7$ for $S/N \approx 1.0\,k_B$ and $\bar{n} \approx 0.5$ for
$S/N \approx 1.5\,k_B$. Performing the above fit for average fillings
of $\bar{n} = 0.7$ and $\bar{n} = 0.5$ yields temperatures that are 20(3)%
and 33(5)%, respectively, below the values reported above.

Fitting details: 

In the fit, we convolved the Bose-Einstein distribution 
function of eq. S5 with an elliptical Gaussian in order to 
account for both the convolution with the initial density dis- 
tribution as well as the vertical extension of the cloud after 
TOF. The ellipticity appears because the vertical lattice axis 
is not perfectly parallel to the imaging axis. We fixed both 
aspect ratio and angle of the elliptical Gaussian to values 
obtained from separate fits of the central peak in a positive 
temperature image. Therefore only the width $\sigma_q$ remains
as an additional fitting parameter for the Bose-Einstein fit,
which contains in total five fitting parameters: $\mu$, $T$, $o$, $\sigma_q$,
and the size $l_{\mathrm{BZ}}$ of the first Brillouin zone after TOF. The
experimental data was normalized to the number of quasi- 
momentum states used in the fit, i.e. the fit corresponds to 
unity filling. 

While monitoring the sum of squared residuals showed a
good stability of the fit for negative temperatures, it is not
possible to use $\sigma_g$ as a free parameter in the positive
temperature case, because $T$ and $\sigma_g$ are not independent when
fitting only a single peak. Assuming identical convolution
functions for negative and positive temperatures, we instead
fixed $\sigma_g$ to the value obtained from fitting the negative
temperature image. Similarly, as $l_{BZ}$ cannot be obtained from
a single peak, it was fixed to the fitted distance of the
first-order coherence peaks in a positive temperature image
without band-mapping. The positive temperature fitting routine
therefore contains only $\mu$, $T$, and $o$ as free parameters.



Critical temperature 

While, contrary to the three-dimensional case, there is no
condensation into a Bose-Einstein condensate (BEC) at $p = 0$
in 2D free space [32], condensation is nonetheless possible
for non-interacting bosons in a 2D harmonic trap [33, 34].
Here the critical temperature is given by



$$ T_c = \sqrt{\frac{6 N_{2D}}{\pi^2}}\,\frac{\hbar\,\omega_{hor}}{k_B}, \tag{S6} $$

where $N_{2D}$ denotes the atom number.

This formula can be adapted to the lattice case via the
effective mass approximation for small quasi-momenta [35],

$$ E(q_x, q_y) = -2J\left[\cos(q_x d) + \cos(q_y d)\right] \approx -4J + \frac{\hbar^2 q^2}{2 m_{eff}}, \tag{S7} $$

with an effective mass of

$$ m_{eff} = \frac{\hbar^2}{2 J d^2}. \tag{S8} $$
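
As a quick numerical check of the effective mass approximation, the sketch below compares the tight-binding band with its parabolic expansion. Expanding both cosines to second order around $q = 0$ gives $-4J + J d^2 q^2$, which equals $-4J + \hbar^2 q^2 / (2 m_{eff})$ for $m_{eff} = \hbar^2/(2Jd^2)$. The function names are illustrative, and energies are in units of $J$ with quasi-momenta in units of $1/d$.

```python
import numpy as np

def band_energy(qx_d, qy_d, J=1.0):
    """Lowest-band tight-binding dispersion of the 2D square lattice."""
    return -2.0 * J * (np.cos(qx_d) + np.cos(qy_d))

def parabolic_energy(qx_d, qy_d, J=1.0):
    """Effective-mass expansion around q = 0.

    hbar^2 q^2 / (2 m_eff) with m_eff = hbar^2 / (2 J d^2) equals J (q d)^2.
    """
    return -4.0 * J + J * (qx_d ** 2 + qy_d ** 2)

if __name__ == "__main__":
    for qd in (0.05, 0.1, 0.5, np.pi):
        exact = band_energy(qd, qd)
        approx = parabolic_energy(qd, qd)
        print(f"q d = {qd:.2f}: exact = {exact:.5f}, parabolic = {approx:.5f}")
```

The two expressions agree closely for $q d \ll 1$ and deviate strongly toward the band edge, so the approximation is only meaningful for states near the band minimum.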



For the employed lattice depth of $V_{lat} = 6\,E_r$ we obtain
$m_{eff}/m = 2.008 \approx 2$ and can derive the effective trap
frequency in the lattice as

$$ \omega_{hor}^{eff} = \omega_{hor}\sqrt{\frac{m}{m_{eff}}}, \tag{S9} $$

which can be inserted for the bare trap frequency in formula
S6.

The atom number in the central layer can be estimated by
assuming the initial $n = 1$ Mott insulator in the atomic limit
to form an ellipsoid of volume $V = \frac{4}{3}\pi R^3/\gamma = d^3 N$, with
the radius $R$ in the horizontal directions and the aspect ratio
$\gamma = \omega_{vert}/\omega_{hor}$ of the trap. The area of the central layer is
given by $A = \pi R^2 = \pi\left[3\gamma d^3 N/(4\pi)\right]^{2/3} = d^2 N_{2D}$, leading to

$$ N_{2D} = 4.6(4) \times 10^3. \tag{S10} $$
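
The geometric estimate above can be written as a short calculation. This is a sketch under the same unit-filling ellipsoid assumption; the function names are illustrative, and the total atom number and aspect ratio in the usage example are hypothetical placeholders, since neither value is quoted in this section.

```python
import math

def central_layer_atom_number(n_total, gamma):
    """N_2D = pi * (3 * gamma * N / (4 * pi))**(2/3) for unit filling."""
    return math.pi * (3.0 * gamma * n_total / (4.0 * math.pi)) ** (2.0 / 3.0)

def radius_in_sites(n_2d):
    """Horizontal radius R / d from A = pi R^2 = d^2 N_2D."""
    return math.sqrt(n_2d / math.pi)

if __name__ == "__main__":
    # Radius implied by the quoted N_2D = 4.6e3.
    print(radius_in_sites(4.6e3))
    # Hypothetical inputs, for illustration only.
    print(central_layer_atom_number(n_total=1.0e5, gamma=4.0))
```

For the quoted $N_{2D} = 4.6(4) \times 10^3$ the implied radius is close to 38 lattice sites, in line with the value $R = 38(2)\,d$ used in the coherence length analysis.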



For our parameters, this yields a critical temperature of
$T_c = 3.4(2)\,J/k_B$. One expects a quasi-condensate with
fluctuating phase below $T_c$, which only well below $T_c$ turns into
a true condensate with small fluctuations on distances
comparable to the Thomas-Fermi size [36]. We note, however,
that this theory does not hold for the interacting case in the
thermodynamic limit, where the BEC transition is replaced
by a Berezinskii-Kosterlitz-Thouless (BKT) transition into a
superfluid [37], which is expected to occur at $T_c \approx 1.8\,J/k_B$
[19] for the parameters used here. Nonetheless, Monte-Carlo
calculations have shown that quasi-condensate correlations
appear well above the critical temperature for the BKT
transition [38].
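
The chain from formula S6 to a number can be sketched as follows. This is an illustration, not the authors' calculation: it assumes the ideal-gas expression $k_B T_c = \sqrt{6 N_{2D}}\,\hbar\omega/\pi$ with the bare trap frequency rescaled by $\sqrt{m/m_{eff}}$, and it takes the horizontal trap frequency of $2\pi \times 43$ Hz quoted for the optimal anti-trapping potential as the input, which is an assumption. The function names are hypothetical.

```python
import math

def effective_trap_frequency(f_hor_hz, mass_ratio):
    """f_eff = f_hor * sqrt(m / m_eff), with mass_ratio = m_eff / m."""
    return f_hor_hz / math.sqrt(mass_ratio)

def critical_temperature_hz(n_2d, f_eff_hz):
    """k_B T_c / h in Hz for an ideal Bose gas in a 2D harmonic trap."""
    return math.sqrt(6.0 * n_2d) / math.pi * f_eff_hz

if __name__ == "__main__":
    f_eff = effective_trap_frequency(43.0, 2.008)  # assumed input frequency
    kTc_hz = critical_temperature_hz(4.6e3, f_eff)
    print(f"k_B T_c / h = {kTc_hz:.0f} Hz")
```

With these inputs the result is about 1.6 kHz. Expressing it in units of $J/k_B$ requires dividing by the tunneling rate $J/h$ at the lattice depth used, which is not quoted in this section.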



Extraction of coherence length 

In principle, the coherence length $l_c$ in the final negative
temperature state can be directly extracted from the
peak width in the interference pattern recorded after long
TOF. After finite TOF, however, the momentum distribution
of the sample is convolved with the initial spatial
distribution of the atoms. Following [20], we therefore model
the one-dimensional (1D) distribution after finite TOF by
$n(k) \propto |w_0(k)|^2 S(k)$, where the envelope function $w_0(k)$ is
the Fourier transform of the on-site Wannier function and
$k = p/\hbar$. The interference term



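$$ S(k) = \sum_{\mu,\nu} \exp\!\left[ i k (r_\mu - r_\nu) - i\,\frac{m}{2\hbar t}\left(r_\mu^2 - r_\nu^2\right) - \frac{r_\mu^2 + r_\nu^2}{4R^2} - \frac{|r_\mu - r_\nu|}{l_c} \right] \tag{S11} $$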



assumes an initial Gaussian density distribution with the
standard deviation $R = 38(2)\,d$ being the radius of the central
1D system (see above), and a phase coherence that decays
exponentially over $l_c$. The coordinate of lattice site $\mu$ is
indicated by $r_\mu$.
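
A 1D interference term of this kind can be evaluated directly as a double sum over lattice sites. The sketch below is illustrative only. It assumes that the site-to-site correlations are the product of a Gaussian density envelope with standard deviation $R$ and an exponential decay over $l_c$, and that finite TOF enters through the quadratic phase $m(r_\mu^2 - r_\nu^2)/(2\hbar t)$ as in [20]. Lengths are in units of $d$, and `beta` stands for the dimensionless combination $m d^2/(2\hbar t)$, whose value here is a hypothetical placeholder.

```python
import numpy as np

def interference_term(k_d, R, l_c, beta, n_sites=None):
    """Evaluate S(k) for a 1D chain.

    k_d  : array of k * d values
    R    : standard deviation of the Gaussian density, in sites
    l_c  : coherence length, in sites
    beta : m d^2 / (2 hbar t), the finite-TOF phase curvature (assumed input)
    """
    k_d = np.asarray(k_d, dtype=float)
    if n_sites is None:
        n_sites = int(8 * R) + 1  # cover about +-4 R around the center
    r = np.arange(n_sites) - n_sites // 2
    # Correlations: Gaussian density envelope times exponential phase decay.
    amp = np.exp(-(r[:, None] ** 2 + r[None, :] ** 2) / (4.0 * R ** 2)
                 - np.abs(r[:, None] - r[None, :]) / l_c)
    S = np.empty(len(k_d))
    for i, k in enumerate(k_d):
        phase = np.exp(1j * (k * r - beta * r ** 2))  # per-site phase factor
        S[i] = np.real(np.conj(phase) @ amp @ phase)
    return S

if __name__ == "__main__":
    k = np.array([0.0, np.pi])  # peak position and midpoint between peaks
    for l_c in (2, 8):
        peak, background = interference_term(k, R=10, l_c=l_c, beta=0.005)
        print(l_c, peak / background)
```

In this sketch the ratio of the signal at the peak to the signal midway between peaks grows with $l_c$, which is the kind of contrast measure that can be compared with the measured background.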




Fig. S1. Extraction of coherence length from background. Red curve:
measured optical density along a cut through two peaks in the TOF
image ($t = 7$ ms). It is scaled to a value of 1 for the lower peak. Blue
curves: normalized densities extracted from one-dimensional
calculations assuming $R = 38d$ (dashed curve: $R = 28d$) and, from light to
dark, $l_c = (1, 2, 3, 4, 5, 7, 11, 38)\,d$ (dashed curve: $l_c = 28d$).

In Fig. S1 we show calculated interference patterns for
various $l_c$ together with experimental data. While the
observed peak width is consistent with phase coherence over
the whole sample, there are two subtleties to consider: first,
for large $l_c$ the calculated peak width depends critically on
$R$, and second, the above 1D calculation neglects the other
dimensions. While these could, in a first approximation, be
included by averaging over various 1D systems with different
$R$, the calculation nonetheless depends crucially on the
initial in-situ distribution. The contrast of the 1D
interference signal, on the other hand, depends less strongly on $R$
(see Fig. S2) and is therefore also less sensitive to averaging
over several systems. From this signal we can deduce a lower
bound for $l_c$ of 3 to 5 lattice sites. Atoms from the interim
Mott insulating core carry little entropy and are expected
to evolve into a superfluid for sufficiently slow ramps. In
contrast, atoms from the former superfluid/normal shell
and from double occupancies carry a lot of entropy and will
show up as an overall $|T| = \infty$ Gaussian background in TOF
images, i.e. they are also expected in the region between the
interference peaks. This simple model therefore
systematically underestimates $l_c$ in the central region.

Fig. S2. Extraction of coherence length from peak density and
background. Blue curves: the calculated value $n(0)/n(h/\lambda_{lat})$ for a given
TOF is plotted versus $l_c$ (in units of $d$) for various radii, from light
to dark, of $R = (8, 18, 28, 38, 48)\,d$. Red-shaded area: value extracted
from the experimental data for the range of measured peak densities.



Lifetime of coherence 

In Fig. S3 we present the complete data and exponential
fits leading to the coherence lifetimes plotted in the inset of
Fig. 4 in the main text, and in Fig. S4 we show corresponding
sample images. We observe a high stability of the negative
temperature interference pattern for the optimal
anti-trapping potential with a frequency of $|\omega_{hor}| = 2\pi \times 43(1)$ Hz,
and also of the positive temperature pattern for the analogous
trapping potential with frequency $|\omega_{hor}| = 2\pi \times 45(3)$ Hz. For
weaker anti-trapping potentials ($|\omega_{hor}| = 2\pi \times 22(3)$ Hz and
$2\pi \times 6(9)$ Hz), we observe not only shorter lifetimes of the
coherence, but also distortions of the cloud from residual
non-harmonic terms in the potentials. The observed reduction
of the total atom number, which is strongest for the weakest
external potentials, can be explained by atoms leaving the trap
in the horizontal plane. For trapping (i.e. not anti-trapping)
potentials, where the lifetime is reduced even further, faint
cross-like structures appear for long hold times, indicating
that the system can return to large metastable positive
temperatures.
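
An exponential lifetime of the kind described above can be extracted from visibility data with a simple least-squares fit. The sketch below is a generic illustration, not the fitting procedure used for Fig. S3: it assumes a pure decay $V(t) = V_0\,e^{-t/\tau}$ without offset, fits a straight line to $\ln V$, and is run on synthetic data. The function name and all numbers are hypothetical.

```python
import numpy as np

def fit_exponential_decay(t_ms, visibility):
    """Fit V(t) = V0 * exp(-t / tau) by linear regression on log(V).

    Returns (V0, tau) with tau in the units of t_ms.
    """
    t_ms = np.asarray(t_ms, dtype=float)
    log_v = np.log(np.asarray(visibility, dtype=float))
    slope, intercept = np.polyfit(t_ms, log_v, 1)  # log V = intercept + slope * t
    return np.exp(intercept), -1.0 / slope

if __name__ == "__main__":
    # Synthetic, noise-free data for illustration only.
    hold_times = np.arange(0.0, 701.0, 100.0)
    synthetic_visibility = 0.6 * np.exp(-hold_times / 500.0)
    v0, tau = fit_exponential_decay(hold_times, synthetic_visibility)
    print(f"V0 = {v0:.3f}, tau = {tau:.1f} ms")
```

A log-linear fit weights small visibilities more heavily than a direct nonlinear fit would, so for noisy data near zero visibility a nonlinear least-squares fit with proper error weights is usually the better choice.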




Fig. S3. Stability of the final states. Data: visibility plotted
versus hold time (in ms) in the final state for various horizontal trap
frequencies. Red: attractive interactions, from dark to light,
$|\omega_{hor}|/2\pi = (43(1), 22(3), 6(9))$ Hz anti-trapping, $(21(4), 42(3), 85(4))$ Hz
trapping; blue: repulsive interactions, 45(3) Hz trapping. Solid lines:
exponential fits.

Fig. S4. Sample images of the stability measurement. Shown are TOF images for various horizontal trap frequencies $|\omega_{hor}|$ for negative (red
borders) and positive (blue borders) temperatures after variable hold times in the final lattice, corresponding to the data in Fig. 4 in the main
text and in Fig. S3.